A Local Tree Alignment-based Soft Pattern Matching Approach for Information Extraction
Abstract
This paper presents a new soft pattern matching method that aims to improve recall with minimal loss of precision in information extraction tasks. Our approach is based on a local tree alignment algorithm, and we present an effective strategy for controlling the flexibility of the pattern matching. The experimental results show that the method can significantly improve information extraction performance.
Similar resources
Local Derivative Pattern with Smart Thresholding: Local Composition Derivative Pattern for Palmprint Matching
Palmprint recognition is a new biometrics system based on physiological characteristics of the palmprint, which includes rich, stable, and unique features such as lines, points, and texture. Texture is one of the most important features extracted from low resolution images. In this paper, a new local descriptor, Local Composition Derivative Pattern (LCDP) is proposed to extract smartly stronger...
InfoXtract Location Normalization: A Hybrid Approach To Geographic References In Information Extraction
Ambiguity is very high for location names. For example, there are 23 cities named ‘Buffalo’ in the U.S. Based on our previous work, this paper presents a refined hybrid approach to geographic references using our information extraction engine InfoXtract. The InfoXtract location normalization module consists of local pattern matching and discourse co-occurrence analysis as well as default senses...
Applying Pattern Mining to Web Information Extraction
Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction, for example WIEN, Stalker, and Softmealy, aims to solve this problem by applying machine learning to automatically generate extractors. However, this approach still requires human intervention to provide training examples. In...
Investigation of Local-Alignment Search for Large Biological Se
One of the most important applications of a string database is storing large gene sequences. Typically the database should support queries like exact matching and pattern matching. The main hindrance to any direct approach for solving this problem is the large amount of unstructured data (of the order of GB) involved. A promising index structure for this problem is the suffix tree. In 2001, Hun...
An Alignment-Based Approach to Semi-supervised Relation Extraction Including Multiple Arguments
We present an alignment-based approach to the semi-supervised relation extraction task involving more than two arguments. We concentrate on improving not only the precision of the extracted results but also the coverage of the method. Our relation extraction method is based on alignment-based pattern matching, which makes the method more flexible. In addition, we extract all ...
Publication date: 2009